Unit Test Audit by jgieringer · Pull Request #100 · SpringCare/VERA-MH

jgieringer · 2026-02-04T23:54:13Z

Description

This PR cleans, (hopefully) improves, restructures, and adds to the unit testing suite.
The biggest change to note is under tests/unit/llm_clients/test_base_llm.py where TestLLMBase and TestJudgeLLMBase are added as abstract testing classes to ensure their subclasses are properly implemented and tested across similar situations.

I first tried just iterating over every subclass and testing accordingly, but that became a MASSIVE file with conditions that didn't always align as each subclass is just enough different 🥳

Example of creating a new llm client without any tests:

/Users/josh.gieringer/Projects/VERA-MH/tests/unit/llm_clients/test_coverage.py::TestLLMCoverage::test_all_llm_implementations_have_test_files failed: tests/unit/llm_clients/test_coverage.py:168: in test_all_llm_implementations_have_test_files
    assert not missing_tests, (
E   AssertionError: 
E     
E     Missing test files for LLM implementations:
E       - CoolLLM should have test_cool_llm.py
E     
E     All LLM implementations must have corresponding test files.
E   assert not [('CoolLLM', 'test_cool_llm.py')]

Issue

Resolves SAF-145

…ubclasses

Copilot

Pull request overview

This PR refactors and expands the unit test suite with a focus on LLM client testing. The main improvement is introducing abstract base test classes (TestLLMBase and TestJudgeLLMBase) that define common test patterns for all LLM implementations, along with helper functions and shared fixtures to reduce code duplication.

Changes:

Introduces base test classes and helper functions for consistent LLM testing across all providers
Refactors all LLM client tests to inherit from base classes
Adds coverage validation tests to ensure complete test coverage for new implementations
Consolidates last_response_metadata initialization in LLMInterface base class
Extracts parse_judge_models and get_parser() functions from judge.py for better testability
Adds validation for max_personas parameter and new test cases

Reviewed changes

Copilot reviewed 32 out of 32 changed files in this pull request and generated 3 comments.

Show a summary per file

File	Description
tests/unit/llm_clients/test_base_llm.py	New abstract base test classes for LLM implementations
tests/unit/llm_clients/test_helpers.py	New helper functions for metadata assertions and mock verification
tests/unit/llm_clients/conftest.py	New shared fixtures for mock responses and conversation histories
tests/unit/llm_clients/test_coverage.py	New automated coverage validation tests
tests/unit/llm_clients/README.md	New comprehensive documentation for test architecture
tests/unit/llm_clients/test_*_llm.py	Refactored to inherit from base classes and use helpers
llm_clients/llm_interface.py	Moved last_response_metadata to base class
llm_clients/*_llm.py	Removed duplicate last_response_metadata declarations
llm_clients/azure_llm.py	Added role field to metadata
judge.py	Extracted get_parser() and uses parse_judge_models()
judge/utils.py	New parse_judge_models function
tests/unit/judge/test_judge_cli.py	Complete rewrite with comprehensive CLI tests
tests/unit/judge/test_*.py	Updated to use proper CLI arg parsing patterns
generate_conversations/utils.py	Added max_personas validation
tests/unit/conftest.py	New shared mock_system_message fixture

Comments suppressed due to low confidence (1)

tests/unit/llm_clients/test_azure_llm.py:490

This method requires 3 positional arguments, whereas overridden TestJudgeLLMBase.test_generate_structured_response_success requires 2.

    async def test_generate_structured_response_success(
        self, mock_azure_config, mock_azure_model
    ):

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

tests/unit/conftest.py

tests/unit/llm_clients/test_azure_llm.py

Copilot

Pull request overview

Copilot reviewed 35 out of 35 changed files in this pull request and generated 7 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

tests/unit/llm_clients/test_azure_llm.py

tests/unit/llm_clients/test_claude_llm.py

tests/unit/llm_clients/test_gemini_llm.py

tests/unit/llm_clients/test_openai_llm.py

Copilot

Pull request overview

Copilot reviewed 35 out of 35 changed files in this pull request and generated 7 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

tests/unit/llm_clients/test_azure_llm.py

tests/unit/llm_clients/test_claude_llm.py

tests/unit/llm_clients/test_gemini_llm.py

tests/unit/llm_clients/test_openai_llm.py

judge/rubric_config.py

tests/unit/judge/test_rubric_config.py

tests/unit/judge/test_utils.py

tests/unit/llm_clients/conftest.py

tests/unit/llm_clients/README.md

emily-vanark

My personal qualms about testing mocks aside, this increases our test coverage from 625 tests with 75.7% coverage to 718 tests with 79.4% converage, and they all pass, so... well done!

tests/unit/llm_clients/test_base_llm.py

generate_conversations/utils.py

judge/rubric_config.py

llm_clients/azure_llm.py

tests/unit/judge/test_llm_judge.py

tests/unit/judge/test_rubric_config.py

tests/unit/llm_clients/test_coverage.py

utils/utils.py

jgieringer added 15 commits February 2, 2026 16:26

clean up convo sim tests

c4c9b71

update generation util tests

478f9bf

parse_judge_models helper

315b152

improve judge parse tests

d8919ce

update judge extra param tests

6ed3d37

update runner extra param tests

509b21a

updated tests for llm judge

5258199

Merge branch 'main' into jgieringer/unit-testing

65b27c0

add rubric assign end asset

01eee65

repurpose judge cli tests into utils

a2fe619

test overall judge script

df0ef96

add last_response_metadata to LLMInterface init

cdf27d4

add role to azure_llm metadata

2114a6f

ensure llm clients are tested + add base tests for llm and judgellm s…

166f0e7

…ubclasses

cleaning errors

cd2dfbb

jgieringer requested review from Copilot, emily-vanark, nz-1 and sator-labs February 4, 2026 23:54

Copilot started reviewing on behalf of jgieringer February 4, 2026 23:54 View session

ignore abstract test class warnings

90fe5ac

Copilot AI reviewed Feb 4, 2026

View reviewed changes

tests/unit/conftest.py Show resolved Hide resolved

tests/unit/llm_clients/test_azure_llm.py Outdated Show resolved Hide resolved

tests/unit/llm_clients/test_azure_llm.py Outdated Show resolved Hide resolved

jgieringer added 3 commits February 5, 2026 11:40

reduce # mock azure configs

9dc2da2

ensure rubric structure

99f23a8

use conftest fixtures over patches

f5316e9

jgieringer requested a review from Copilot February 5, 2026 21:26

Copilot started reviewing on behalf of jgieringer February 5, 2026 21:26 View session

Copilot AI reviewed Feb 5, 2026

View reviewed changes

apply usefixtures at class level

7c43956

jgieringer requested a review from Copilot February 5, 2026 22:30

Copilot started reviewing on behalf of jgieringer February 5, 2026 22:31 View session

Copilot AI reviewed Feb 5, 2026

View reviewed changes

match base method signatures for the override

d40c38e